A COMPREHENSIVE STUDY ON BIG DATA
Faculty Mentor:
Mr. Sanjive Saxena
Student Names:
Akash Bhardwaj (MCA-3rd Year)
Diksha Singla (MCA-3rd Year)
1. INTRODUCTION
Big data is about building advanced analytical applications on new types of data in order to serve customers better and gain a competitive advantage. The term describes a substantial volume of data, which can be both structured and unstructured. Organizations analyze the data they collect for insights that lead to better decisions and strategic business moves.
2. PILLARS OF BIG DATA
The pillars of big data are the 5 Vs:
Volume: Data is collected from various sources, including business transactions, social media, and sensor or machine-to-machine data. Technologies such as Hadoop have eased the burden on organizations of storing data at this scale.
Velocity: Velocity refers to how rapidly data is generated and how quickly it must be processed to meet demand and realize the value it holds. Business processes, application logs, networks, social media sites, sensors and mobile devices are among the sources that contribute massive, continuous streams of data.
Variety: Data comes in many forms, such as structured numeric data, unstructured text documents, email, video, audio, stock ticker data and financial transactions. This heterogeneity causes difficulties in storing, mining and analyzing the data.
Variability: Data can be inconsistent at times, which makes it more complicated to handle and manage effectively.
Value: Because data comes from numerous sources, linking, matching, cleaning and transforming it is a tedious job. Nevertheless, it is essential to connect the data through hierarchies and multiple data linkages in order to extract useful information.
3. WHAT BIG DATA COMPRISES
Comparative Analysis: It involves examining customer metrics and observing real-time customer engagement in order to compare an organization's products, services and brand authority with those of its competitors.
Social Media Listening: Information about a particular business or product is gathered from social media. It helps identify target audiences for marketing campaigns by revealing activity around specific topics across diverse sources.
Marketing Analysis: This covers information used to shape new products, services and initiatives.
Customer Satisfaction: The gathered information reveals how consumers feel about a brand, flags potential issues as they emerge, and shows how brand loyalty can be maintained and customer service efforts improved.
4. VARIATIONS OF DATA TYPES
Big data encompasses a wide variety of data types, including:
Structured Data: data held in SQL databases, data warehouses and data lakes.
Unstructured Data: text and document files held in Hadoop clusters or NoSQL systems.
Semi-Structured Data: information gathered from web server logs or streaming data from sensors.
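The three categories above can be illustrated with a minimal sketch that uses only the Python standard library. The table name, field names and sample values are hypothetical and chosen only for illustration; real systems would use the storage technologies named above rather than in-process objects.

```python
import json
import sqlite3

# Structured: rows with a fixed schema, queried with SQL.
# (The "orders" table and its columns are made up for this example.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 4.5)],
)
total_by_customer = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)

# Semi-structured: self-describing records whose fields may vary per record,
# such as JSON lines emitted by sensors.
sensor_lines = [
    '{"sensor": "t1", "temp_c": 21.5}',
    '{"sensor": "t2", "temp_c": 19.0, "battery": 0.8}',
]
readings = [json.loads(line) for line in sensor_lines]
fields_seen = sorted({key for reading in readings for key in reading})

# Unstructured: free text with no schema; any structure must be derived.
review = "Great phone, great battery, terrible charger."
word_count = len(review.split())

print(total_by_customer)
print(fields_seen)
print(word_count)
```

The structured data can be aggregated directly with a query, the semi-structured records have to be parsed before their differing fields are known, and the unstructured text yields nothing beyond what is computed from it.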
5. TECHNOLOGIES SUPPORTING BIG DATA
Several technologies support big data, including the Hadoop ecosystem, Apache Spark, data lakes, NoSQL databases, in-memory databases, machine learning, text mining, data warehouses, data mining and predictive analysis.
Hadoop Ecosystem: Apache Hadoop is open source software for scalable, distributed computing. It is a framework that enables distributed processing of huge data sets across clusters of computers using simple programming models such as MapReduce.
Apache Spark: It is an open source cluster-computing framework that serves as an engine for processing data within Hadoop. It provides native bindings for Java, Scala, Python and R, and supports SQL, streaming data, machine learning and graph processing.
Data Lakes: These are storage repositories that hold extremely large volumes of raw data in its native format until business users need it. They are designed to make it easier for users to access vast amounts of data when the need arises.
NoSQL Databases: These databases offer high operational speed and great flexibility. They can be scaled horizontally across hundreds or thousands of servers.
In-Memory Databases: These rely on main memory rather than disk for data storage, which makes them faster than disk-optimized databases. They are used in big data analytics, including the creation of data warehouses and data marts.
Machine Learning: Models are trained on huge data sets to deliver faster and more accurate results, helping organizations identify profitable opportunities and manage risk.
Text Mining: Text gathered from the web, comments, books and other text-based sources is analyzed using natural language processing and machine learning to discover relationships within the data.
Data Warehouse: Data quality measures must be applied to the data that constantly flows through an organization before it is analyzed. Once the data is reliable, a central database should be established so that the whole enterprise works from the same information.
Data Mining: Large data sets are examined to discover patterns that help resolve business problems, using data mining software such as Weka.
Predictive Analysis: Machine learning techniques and statistical algorithms are used to analyze data and predict future outcomes, which helps in fraud detection, risk assessment and marketing.
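The Hadoop Ecosystem entry above refers to programming models for distributed processing; the best known of these is MapReduce. The following is a minimal single-process sketch of the map, shuffle and reduce steps applied to a word count. It does not use Hadoop itself, and the function names and sample documents are hypothetical; on a real cluster the framework would run the map and reduce steps in parallel on different machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here the
    # per-document outputs are simply chained together in one process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


docs = ["big data needs big storage", "data drives decisions"]
counts = word_count(docs)
print(counts)
```

Because each map call depends only on its own document and each reduce call only on the values for its own key, the work can be split across many machines, which is what makes this model suitable for huge data sets.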
6. BIG DATA SIGNIFICANCE FOR LARGE ORGANIZATIONS
Using big data analytics, organizations analyze their stored data to identify opportunities, which helps them move forward smartly, earn higher profits, satisfy their customers and operate more efficiently. Big data provides value in the following ways:
Cost Reduction: Technologies such as Hadoop and cloud-based analytics considerably reduce the cost of storing large amounts of data, which benefits the running of business processes.
Rapid or Effective Decision Making: The speed of Hadoop and in-memory databases, combined with the ability to analyze new data sources, lets businesses analyze information immediately and make decisions quickly.
New Products: By analyzing the information they gather, organizations can gauge customer needs and satisfaction, which enables them to offer customers what they want. Companies are creating more new products to meet their customers' needs.
7. BIG DATA MERITS
Big data can give companies valuable insights into their customers, which helps them refine marketing campaigns and techniques and improve customer engagement and conversion rates. Organizations that make use of big data hold a competitive edge over those that do not, and it enables them to become customer focused quickly. Real-time user data can be used to evaluate customer preferences, which allows organizations to improve their marketing strategies, respond to customer needs more sensitively and operate more effectively.
8. BIG DATA ANALYSIS FROM HUMAN PERSPECTIVE
Ultimately, the effectiveness and value of big data depend on how well the data is understood when queries are constructed for big data analytics projects. Some big data tools serve specialized niches and help non-technical users apply the gathered information in predictive analysis applications.